The Linguistics of Readability: The Next Step for Word Processing
Abstract
In this paper, we present a new approach to writing tools that extends beyond rudimentary spelling and grammar checking to the content of the writing itself. Linguistic methods have long been used to detect familiar lexical patterns in text to aid automatic summarization and translation of documents. We apply these methods to assess the quality of a text, and we implement new techniques for measuring readability and giving authors feedback on how to improve their documents. We take an extended view of readability that considers text cohesion, propositional density, and word familiarity. We provide simple feedback to the user detailing the most and least readable sentences, the sentences most densely packed with information, and the most cohesive words in their document. Commonly used verbose words and phrases in the text, as identified by The Plain English Campaign, can be swapped for user-selected replacements. Our techniques were implemented as a freely downloadable extension to the Open Office word processor, which has been downloaded 6,500 times to date.
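The idea of reporting the most and least readable sentences can be illustrated with a minimal sketch. It does not reproduce the paper's measures (cohesion, propositional density, word familiarity), which the abstract does not specify in detail. As a stand-in it scores each sentence with the classic Flesch Reading Ease formula. The function names, the vowel-run syllable heuristic, and the regex sentence splitter are all assumptions made for illustration.

```python
import re
from typing import List


def count_syllables(word: str) -> int:
    """Rough syllable estimate (assumption): count runs of vowels, minimum one."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def flesch_reading_ease(sentence: str) -> float:
    """Flesch Reading Ease for a single sentence; higher means easier to read."""
    words = re.findall(r"[A-Za-z']+", sentence)
    if not words:
        return 0.0
    syllables = sum(count_syllables(w) for w in words)
    # With exactly one sentence, words-per-sentence is just len(words).
    return 206.835 - 1.015 * len(words) - 84.6 * (syllables / len(words))


def split_sentences(text: str) -> List[str]:
    """Naive splitter (assumption): break after '.', '!' or '?' plus whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def rank_sentences(text: str) -> List[str]:
    """Return sentences ordered from most to least readable."""
    return sorted(split_sentences(text), key=flesch_reading_ease, reverse=True)


if __name__ == "__main__":
    sample = (
        "The cat sat. "
        "The implementation necessitates considerable organizational reconsideration."
    )
    ranked = rank_sentences(sample)
    print("Most readable: ", ranked[0])
    print("Least readable:", ranked[-1])
```

A real tool would likely replace the syllable heuristic with a pronunciation dictionary and the splitter with a proper sentence tokenizer; the ranking step itself would stay the same.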
Similar resources
EFL Textbook Evaluation: An Analysis of Readability and Vocabulary Profiler of Four Corners Book Series
This study aimed to investigate whether there is any significant relationship between the readability and vocabulary profile including the most frequent words (K1 words) and academic word list (AWL) of reading passages of Four Corners series which were EFL textbooks. To determine the readability of the texts, the Flesch–Kincaid (1975) readability test was used, while the texts' academic word li...
Do We Need Discipline-Specific Academic Word Lists? Linguistics Academic Word List (LAWL)
This corpus-based study aimed at exploring the most frequently used academic words in linguistics and comparing the word list with the distribution of high-frequency words in Coxhead’s Academic Word List (AWL) and West’s General Service List (GSL) to examine their coverage within the linguistics corpus. To this end, a corpus of 700 linguistics research articles (LRAC), consisting of approximately ...
Modeling the Phonological Recognition of Persian Words
A model of spoken word recognition is proposed. This model is particularly concerned with the extraction of cues from the signal, leading to a specification of a word in terms of bundles of distinctive features, which are assumed to be the building blocks of words. In the model proposed, auditory input is chunked into a set of successive time slices. It is assumed that the derivation of the underly...
Producing a Persian Text Tokenizer Corpus Focusing on Its Computational Linguistics Considerations
The main task of tokenization is to divide the sentences of a text into their constituent units and remove punctuation marks (periods, commas, etc.). Each unit is a continuous lexical or grammatical written sequence that forms an independent semantic unit. Tokenization occurs at the word level, and the extracted units can be used as input to other components such as a stemmer. The requirement to create...